167 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
English Farsi French German Japanese
Availability:
Freely Available
License:
Size:
4.5 MByte
Production Status:
Existing-updated
Use:
Document Classification, Text categorisation
- Paper title: Multi-class Multilingual Classification of Wikipedia Articles Using Extended Named Entity Tag Set
- Paper track: Evaluation/oral presentation
- Paper status: Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Hassan S. Shavarani | Shinra-5LDS Dataset | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Czech English French German
Availability:
Freely Available
License:
Not known yet
Size:
2 hours
Production Status:
Newly created-in progress
Use:
Language Identification
- Paper title: Detecting English Speech in the Air Traffic Control Voice Communication
- Paper track: 14.7 Automatic Speech Recognition in Air Traffic M/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Igor Szoke | ATCO2 ATC dataset | /N |
Documentation:
Not yet
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Arabic English French German Greek Italian Portuguese Russian Spanish
Availability:
Freely Available
License:
CC BY-NC-ND 4.0
Size:
200
Production Status:
Newly created-finished
Use:
Corpus Creation/Annotation
- Paper title: The Multilingual TEDx Corpus for Speech Recognition and Translation
- Paper track: 12.6 Speech and multimodal resources/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Elizabeth Salesky | Multilingual TEDx (mTEDx) | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Bulgarian Croatian Czech French German Mandarin Polish Portuguese Spanish Thai Turkish
Availability:
From Data Center(s)
License:
ELRA
Size:
18.7 GByte
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: Zero-shot Cross-Lingual Phonetic Recognition with External Language Embedding
- Paper track: 8.11 Cross-lingual and multilingual/accent aspects/Poster Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Heting Gao | GlobalPhone | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Arabic Catalan Chinese Dutch Estonian French German Indonesian Italian Japanese Latvian Mongolian Persian Portuguese Russian Slovenian Spanish Swedish Tamil Turkish Welsh
Availability:
Freely Available
License:
CC0
Size:
2880 hours
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
- Paper title: CoVoST 2 and Massively Multilingual Speech Translation
- Paper track: 12.1 Spoken machine translation/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Juan Pino | CoVoST 2 | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Arabic English Farsi French German Hindi Japanese Korean Mandarin Russian Spanish Tamil Vietnamese
Availability:
From Owner
License:
LDC
Size:
46 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2003 NIST Language Recognition Evaluation | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Multilingual
Languages:
Amharic Bosnian Croatian Dari English French Georgian Haitian Hausa Hindi Korean Mandarin Chinese Persian Portuguese Pushto Russian Spanish Turkish Ukrainian Urdu Vietnamese Yue Chinese
Availability:
From Owner
License:
LDC
Size:
215 hours
Production Status:
Existing-used
Use:
Language Identification
- Paper title: Modeling and training strategies for language recognition systems
- Paper track: 4.1 Language identification and verification, lang/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Raphaël Duroselle | 2009 NIST Language Recognition Evaluation Test Set | /N |
Documentation:
None
Speech
Corpus,
Language Type:
Monolingual
Languages:
French
Availability:
Not Available
License:
Size:
None
Production Status:
Existing-used
Use:
Information Extraction, Information Retrieval
- Paper title: Automatic Speech Recognition systems errors for objective sleepiness detection through voice
- Paper track: 3.4 Automatic analysis of speaker traits/Oral Presentation
- Paper status: Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Vincent P. Martin | MSLT corpus | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Bilingual
Languages:
English French
Availability:
Freely Available
License:
OpenSource
Size:
12 GByte
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
- Paper title: End-to-End Speech Translation with Knowledge Distillation
- Paper track: 12.1 Spoken machine translation/Poster Presentation
- Paper status: Accept - Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Yuchen Liu | Augmented LibriSpeech | /N |
Documentation:
None
Multimodal/Multimedia
Corpus,
Language Type:
Monolingual
Languages:
French
Availability:
Not Available
License:
Size:
2 hours
Production Status:
Newly created-finished
Use:
Visual Speech Synthesis
- Paper title: Modeling Labial Coarticulation with Bidirectional Gated Recurrent Networks and Transfer Learning
- Paper track: 7.15 Multimodal synthesis for avatars and talking /Oral Presentation
- Paper status: Accept - Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Théo Biasutto--Lervat | CTN | /N |
Documentation:
None




